[Config Refactor] HunyuanImage3 pipeline configs by lishunyang12 · Pull Request #2989 · vllm-project/vllm-omni

lishunyang12 · 2026-04-21T11:04:35Z

Summary

Continuation of RFC #2072. Migrates HunyuanImage-3.0 from the legacy vllm_omni/model_executor/stage_configs/hunyuan_image3_*.yaml files (7 yamls + 2 platform overlays) into the new pipeline.py (topology) + vllm_omni/deploy/<model>.yaml (deployment) split established by #2383.

Variant strategy

Five separate model_types — one per task — registered in _OMNI_PIPELINES. Precedent: qwen2_5_omni + qwen2_5_omni_thinker_only.

model_type	Topology	Default deploy yaml
`hunyuan_image3_t2i`	AR (stage 0) → DiT (stage 1) with KV transfer	`deploy/hunyuan_image3_t2i.yaml` (+ `_fp8.yaml`)
`hunyuan_image3_it2i`	AR (mm input, stage 0) → DiT (stage 1)	`deploy/hunyuan_image3_it2i.yaml`
`hunyuan_image3_dit_only`	DiT only (stage 0)	BYO — only CI yaml ships
`hunyuan_image3_i2t`	AR only (stage 0) — image+text → text	BYO
`hunyuan_image3_t2t`	AR only (stage 0) — text → text	BYO

dit_only / i2t / t2t carry only the pipeline.py topology — no default deploy yaml — because hardware sizing for those modes depends on the use case. Users bringing their own deploy yaml just point --pipeline hunyuan_image3_<variant> at it.
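As a rough illustration of the variant strategy, the sketch below models a registry keyed by model_type in which some entries carry a default deploy yaml and others are BYO. The `PipelineEntry` dataclass, the `resolve_deploy` helper, and the dict layout are hypothetical and are not the actual `_OMNI_PIPELINES` implementation; only the model_type names and default yaml paths come from the table above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PipelineEntry:
    """Hypothetical registry record: topology summary plus optional default deploy."""
    topology: str
    default_deploy: Optional[str] = None  # None means bring-your-own (BYO)


# Illustrative stand-in for _OMNI_PIPELINES; the real registry may look different.
OMNI_PIPELINES_SKETCH = {
    "hunyuan_image3_t2i": PipelineEntry("AR -> DiT", "deploy/hunyuan_image3_t2i.yaml"),
    "hunyuan_image3_it2i": PipelineEntry("AR -> DiT", "deploy/hunyuan_image3_it2i.yaml"),
    "hunyuan_image3_dit_only": PipelineEntry("DiT only"),
    "hunyuan_image3_i2t": PipelineEntry("AR only"),
    "hunyuan_image3_t2t": PipelineEntry("AR only"),
}


def resolve_deploy(model_type: str, user_deploy: Optional[str] = None) -> str:
    """Pick the deploy yaml: an explicit user path wins, else the default."""
    entry = OMNI_PIPELINES_SKETCH.get(model_type)
    if entry is None:
        raise KeyError(f"unknown pipeline: {model_type}")
    deploy = user_deploy or entry.default_deploy
    if deploy is None:
        raise ValueError(f"{model_type} is BYO: pass a deploy yaml explicitly")
    return deploy
```

In this sketch a BYO variant still resolves as long as the topology entry exists, which is why registering the topology matters even when no default deploy yaml ships.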

T2I path choice

T2I is registered as AR→DiT (matching the official bot_task="text" flow that produces a textual prompt for the DiT). For users wanting Tencent's bot_task="image" semantics (skip the AR side entirely), use --pipeline hunyuan_image3_dit_only with their own deploy yaml. The two paths produce different image quality / latency trade-offs; AR→DiT is the default because it matches the headline modality demonstrated in the official repo.

Deploy yaml consolidation

Hardware-tier deltas (1×H20 vs 4×H20 vs 2×L40S etc.) collapse into platforms: sections inside one deploy/hunyuan_image3_<variant>.yaml per task.
NPU + XPU overlays moved into platforms.npu / platforms.xpu sections of the corresponding CUDA yaml — mirrors qwen3_omni_moe.yaml structure.
FP8 stays as a separate deploy/hunyuan_image3_t2i_fp8.yaml (quantization is not a platform delta).

Field ownership

Following the 2/N decisions:

Pipeline (topology): model_arch=HunyuanImage3ForCausalMM, execution_type, input_sources, final_output*, omni_kv_config (KV transfer between AR↔DiT), kv_transfer_criteria, custom_process_input_func (hunyuan_image3.ar2diffusion on DiT stages), AR stop_token_ids: [127957] as sampling_constraints (model-intrinsic until item 2 of #2887, "[Follow-up] Deploy/pipeline config follow-ups from #2383", lands).
Deploy: gpu_memory_utilization, devices, tensor_parallel_size, max_num_seqs, default_sampling_params (per-variant AR sampling differs: t2i=greedy, it2i=temp=0.6/top_p=0.95/top_k=1024), DiT num_inference_steps=50, guidance_scale=2.5, hf_overrides.rope_parameters.mrope_section=[0,32,32]. AR stages keep enforce_eager: true per qwen3_omni_moe convention; DiT stages omit the field so cudagraph runs by default.
worker_cls / scheduler_cls are auto-derived from StageExecutionType.LLM_AR via _resolve_execution_mode — not copied.
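The ownership split can be pictured with the following sketch. All class names, the worker-name strings, and the 0.9 memory default are assumptions made for illustration; they are not the real `StagePipelineConfig`, `StageExecutionType`, or `_resolve_execution_mode` code. The point is only that topology fields and deploy fields live in separate objects and that the worker is derived from the execution type rather than copied from yaml.

```python
from dataclasses import dataclass
from enum import Enum


class ExecutionType(Enum):
    LLM_AR = "llm_ar"
    DIFFUSION = "diffusion"


@dataclass(frozen=True)
class StageTopology:
    """Model-intrinsic fields: owned by pipeline.py in this sketch."""
    stage_id: int
    execution_type: ExecutionType
    stop_token_ids: tuple = ()


@dataclass
class StageDeploy:
    """Hardware knobs: owned by the deploy yaml in this sketch."""
    stage_id: int
    gpu_memory_utilization: float = 0.9  # assumed default, for illustration only
    tensor_parallel_size: int = 1
    enforce_eager: bool = False  # default lets cudagraph run unless a stage opts out


# Hypothetical lookup standing in for the auto-derivation of worker classes.
WORKER_BY_EXECUTION_TYPE = {
    ExecutionType.LLM_AR: "ARWorker",
    ExecutionType.DIFFUSION: "DiffusionWorker",
}


def merge_stage(topology: StageTopology, deploy: StageDeploy) -> dict:
    """Combine both halves; the worker is derived, never copied from yaml."""
    assert topology.stage_id == deploy.stage_id
    return {
        "stage_id": topology.stage_id,
        "worker": WORKER_BY_EXECUTION_TYPE[topology.execution_type],
        "stop_token_ids": list(topology.stop_token_ids),
        "gpu_memory_utilization": deploy.gpu_memory_utilization,
        "tensor_parallel_size": deploy.tensor_parallel_size,
        "enforce_eager": deploy.enforce_eager,
    }


ar = merge_stage(
    StageTopology(0, ExecutionType.LLM_AR, stop_token_ids=(127957,)),
    StageDeploy(0, gpu_memory_utilization=0.95, tensor_parallel_size=4, enforce_eager=True),
)
dit = merge_stage(
    StageTopology(1, ExecutionType.DIFFUSION),
    StageDeploy(1, tensor_parallel_size=4),
)
```

Here the AR stage sets enforce_eager explicitly while the DiT stage omits it and picks up the dataclass default.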

Cleanup

Deletes:

vllm_omni/model_executor/stage_configs/hunyuan_image3_{t2i,t2i_2gpu,moe,moe_dit_2gpu_fp8,it2i,i2t,t2t}.yaml
vllm_omni/platforms/{npu,xpu}/stage_configs/hunyuan_image3_t2i.yaml

Updates:

Examples and tests that reference the old yaml paths now point to vllm_omni/deploy/.
tests/e2e/offline_inference/stage_configs/hunyuan_image3_dit_only_ci.yaml → tests/e2e/offline_inference/deploy/hunyuan_image3_dit_only_ci.yaml (renamed for consistency with the new schema).
examples/offline_inference/hunyuan_image3/end2end.py switched to Omni.from_cli_args(args, parser=parser, **overrides) so argparse defaults don't silently clobber deploy YAML values (override-precedence revisited in [RFC] Sentinel-default precedence for stage engine args #3035 post-0.20.0).

Coordination

Independent of #2977 (HunyuanImage3 has a real config.json at the repo root, so model-type detection works without diffusers_class_name).

Test plan

pre-commit run --files <changed-files> passes
pytest tests/config/test_pipeline_registry.py -v
CI green
Manual e2e: t2i, it2i, dit_only on H20

cc @alex-jw-brooks @hsliuustc0106 @nussejzz @TaffyOfficial @xuechendi @xiaohajiayou

lishunyang12 · 2026-04-21T13:53:49Z

No GPU to test. Awaiting

chatgpt-codex-connector · 2026-04-21T13:54:02Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

lishunyang12 · 2026-04-21T13:55:12Z

@alex-jw-brooks @xiaohajiayou PTAL

hsliuustc0106 · 2026-04-21T15:37:46Z

cc @kechengliu97

hsliuustc0106 · 2026-04-21T15:37:58Z

cc @Semmer2

hsliuustc0106

Blocker scan

Category	Result
Correctness	BLOCK
Reliability/Safety	PASS
Breaking Changes	BLOCK
Test Coverage	PASS
Documentation	BLOCK
stage_config.py wiring	PASS

Blocking issues

1. i2t and t2t modes deleted without migration

PR description says "Five separate model_types" and explicitly lists:

hunyuan_image3_i2t — AR only (stage 0) — Replaces i2t.yaml
hunyuan_image3_t2t — AR only (stage 0) — Replaces t2t.yaml

But neither is registered in pipeline_registry.py, no pipeline definition exists in pipeline.py, and both yamls are simply deleted. The README and end2end.py also remove those modalities entirely.

This is a breaking change for existing users. Either:

Add hunyuan_image3_i2t and hunyuan_image3_t2t pipeline definitions + deploy yamls, OR
Update the PR description to explicitly state these modes are intentionally dropped (not "Replaced")

2. FP8 deploy yaml mentioned but not included

PR body says:

FP8 stays as a separate deploy/hunyuan_image3_t2i_fp8.yaml (quantization is not a platform delta).

No such file exists in this diff. Both moe_dit_2gpu_fp8.yaml (2x H200 FP8 DiT) and t2i_2gpu.yaml (2-GPU AR) are deleted without replacement. Users on 2-GPU or FP8 setups lose their configs.

3. PR description / implementation mismatch

The description states 5 model_types; only 3 are implemented. The description lists NPU/XPU overlay consolidation, but only the t2i deploy yaml gets platform sections — it2i gets none (was this intentional?).

Non-blocking notes

stage_config.py changes look correct — omni_kv_config and requires_multimodal_data are new explicit wire-ups from StagePipelineConfig fields to engine_args/runtime, needed for the pipeline.py framework. No regression risk for existing yaml-based models.
hf_architectures (renders as *** in some tools) correctly used on T2I only for model-type fallback.
Pipeline topology definitions are clean and well-documented.
dit_only_ci.yaml correctly uses the new schema for the e2e test.
XPU/NPU consolidation into platforms: sections is the right pattern.

alex-jw-brooks

Thanks, I think it looks good. Some thoughts below; I think the text2text/img2text points in the earlier review are also important.

alex-jw-brooks · 2026-04-21T18:29:52Z

+      enable_expert_parallel: false
+    vae_use_slicing: false
+    vae_use_tiling: false
+    cache_backend: null


I'm actually not sure this is the right default value for cache_backend; it might currently be "none" as a string (e.g., based on places like this)

Since this and some of the others are default values though, I think it would be best to remove them where possible, since it makes the configs noisier

Good catch — yeah, cache_backend: null / cache_config: null / enable_cache_dit_summary: false were all defaults. Removed in df65e00. Same for the matching it2i values.

alex-jw-brooks · 2026-04-21T18:37:13Z

+    devices: "4,5,6,7"
+    parallel_config:
+      tensor_parallel_size: 4
+      enable_expert_parallel: false


Why is expert parallel disabled in this config, but enabled in hunyuan_image3_it2i.yaml?

Copy-paste asymmetry — no real reason. Aligned t2i stage 1 to enable_expert_parallel: true to match it2i in df65e00.

alex-jw-brooks · 2026-04-21T18:37:41Z

+  - stage_id: 0
+    max_num_seqs: 1
+    gpu_memory_utilization: 0.95
+    enforce_eager: true


Also curious about enforce_eager=True here

Stage 0 is the AR/MoE side — kept enforce_eager: true per the qwen3_omni_moe.yaml convention (cudagraph capture is unreliable across MoE expert routing during AR token-by-token generation). Flipped stage 1 (DiT) in df65e00 to fall through to the dataclass default False so cudagraph runs there.

alex-jw-brooks · 2026-04-21T18:41:57Z

+        devices: "0,1,2,3,4,5,6,7"
+        parallel_config:
+          tensor_parallel_size: 8
+          enable_expert_parallel: true


It may be a good idea to add the NPU config here too, since there was one before. I only see an NPU section in the CI config

Done in df65e00 — ported the deleted platforms/npu/stage_configs/hunyuan_image3_t2i.yaml into a platforms.npu section under platforms.xpu.

Update on this — reversed in 8d0142e after @TaffyOfficial pointed out platform overlays are stage-wise patches, not full stage-list replacements. An NPU DiT-only override landing on the AR→DiT base would silently leave the AR stage running on NPU.

Split fix: dropped the platforms.npu block from hunyuan_image3_t2i.yaml (kept a pointer comment up top) and ported it to vllm_omni/deploy/hunyuan_image3_dit_only.yaml instead. Users on NPU run --pipeline hunyuan_image3_dit_only with the shipped dit_only deploy. Sorry for the back-and-forth.
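To make the failure mode concrete, here is a minimal sketch of stage-wise overlay patching. The `apply_platform_overlay` helper, the `role` key, and the dict layout are assumptions for illustration; the actual merge logic in vllm-omni may differ in detail.

```python
import copy


def apply_platform_overlay(base_stages, overlay_stages):
    """Patch stages by stage_id; stages absent from the overlay are kept as-is."""
    merged = {s["stage_id"]: copy.deepcopy(s) for s in base_stages}
    for patch in overlay_stages:
        merged.setdefault(patch["stage_id"], {}).update(copy.deepcopy(patch))
    return [merged[k] for k in sorted(merged)]


# Base deploy for an AR -> DiT pipeline (values are illustrative).
base = [
    {"stage_id": 0, "role": "AR", "devices": "0,1,2,3"},
    {"stage_id": 1, "role": "DiT", "devices": "4,5,6,7"},
]

# An overlay written as if the pipeline were DiT-only: it only patches stage 0.
npu_overlay = [{"stage_id": 0, "devices": "0,1,2,3,4,5,6,7"}]

result = apply_platform_overlay(base, npu_overlay)
# Stage 0 is still the AR stage and stage 1 (DiT) is still present, so the
# overlay did not turn the deployment into a single DiT stage.
```

Under this patching model the overlay cannot remove a stage, so a DiT-only NPU deployment needs a base whose topology is already DiT-only.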

kechengliu97 · 2026-04-22T02:02:50Z

Looks good. It is necessary to extract the common part rather than "one strategy, one yaml file".

TaffyOfficial · 2026-04-22T02:15:37Z

Re: the description says "users can pass --pipeline hunyuan_image3_<variant> with a custom deploy yaml", but i2t/t2t have no pipeline definition in pipeline.py after this PR; only dit_only does. A user bringing their own deploy yaml would still hit a registry lookup failure. If the intent is "BYO", please keep the pipeline.py topology entries for i2t/t2t (small addition, no yaml cost) and drop only the deploy yamls.

TaffyOfficial · 2026-04-22T02:15:57Z

tests/e2e/.../stage_configs/dit_only_ci.yaml reintroduces the stage_configs/ directory name that #2383 explicitly deprecated. Suggest renaming to tests/e2e/.../deploy/hunyuan_image3_dit_only_ci.yaml to stay consistent with the new schema — otherwise follow-up 2c's cleanup will miss it.

TaffyOfficial · 2026-04-22T02:21:57Z

One small design thought (feel free to ignore if this is already settled).

The official HunyuanImage-3 repo uses generate_image(prompt, bot_task="image") for T2I, which maps to the DIT_ONLY path rather than going through AR→DiT. I ran into this while setting up a GenEval CI on a fork and ended up registering DIT_ONLY as the default for hunyuan_image_3_moe so that vllm serve ... --omni would Just Work out of the box for T2I users.

The current PR keeps hunyuan_image3_t2i as AR→DiT and exposes hunyuan_image3_dit_only as a separate model_type, which is cleaner semantically but means users have to know to pass --pipeline hunyuan_image3_dit_only to match official behavior.

Not saying one is right and the other wrong; both have merit. Just thought it'd be worth a sentence in the PR description explaining the choice, so downstream users know which path matches the Tencent reference.

xiaohajiayou · 2026-04-22T07:23:37Z

It may be necessary to use Omni.from_cli_args() here; otherwise, it won’t be possible to distinguish CLI arguments explicitly provided by the user.

Done in df65e00 — switched to Omni.from_cli_args(args, parser=parser, **overrides). Note: this distinction is being revisited under #3035 (sentinel-default precedence) post-0.20.0; for now matching today's convention.
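One common way to keep parser defaults from overriding yaml values is to give every flag a `None` default and skip `None` when merging. The sketch below shows that idea with plain argparse; it is not the implementation of `Omni.from_cli_args`, and the flag names and the `merge_cli_over_yaml` helper are chosen here purely for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser whose defaults are None so 'not provided' is detectable."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--gpu-memory-utilization", type=float, default=None)
    parser.add_argument("--max-num-seqs", type=int, default=None)
    return parser


def merge_cli_over_yaml(yaml_values: dict, cli_args: argparse.Namespace) -> dict:
    """Explicit CLI values override yaml; None (flag not passed) is skipped."""
    merged = dict(yaml_values)
    for key, value in vars(cli_args).items():
        if value is not None:
            merged[key] = value
    return merged


yaml_values = {"gpu_memory_utilization": 0.95, "max_num_seqs": 1}
args = build_parser().parse_args(["--max-num-seqs", "4"])
merged = merge_cli_over_yaml(yaml_values, args)
```

Only the flag that was actually passed replaces the yaml value; the untouched flag leaves the yaml setting in place.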

Hi @xiaohajiayou, I sent you an email asking you to add me on WeChat to facilitate further conversation.

lishunyang12 · 2026-04-22T15:45:53Z

Pushed df65e00 addressing the open review items:

Blockers (cc @hsliuustc0106)

Added HUNYUAN_IMAGE3_I2T_PIPELINE and HUNYUAN_IMAGE3_T2T_PIPELINE (AR-only topologies) to pipeline.py and registered them. No default deploy yaml — BYO per @TaffyOfficial's suggestion (hardware sizing for I2T/T2T depends on use case).
Added vllm_omni/deploy/hunyuan_image3_t2i_fp8.yaml (2x H200 FP8) — this was promised in the PR body but missing from the diff.
Updated PR description to match the actual implementation (5 model_types, FP8 yaml, NPU section, T2I path note).

Inline (cc @alex-jw-brooks)

Removed cache_backend / cache_config / enable_cache_dit_summary defaults from t2i.yaml.
Aligned t2i stage 1 enable_expert_parallel to true (matching it2i — the asymmetry was copy-paste).
Dropped enforce_eager: true from DiT stages in both t2i and it2i (falls through to dataclass default False for cudagraph). Kept on AR stages per qwen3_omni_moe convention.
Added platforms.npu section back to t2i.yaml, ported from the deleted NPU overlay.

TaffyOfficial follow-ups

Renamed tests/e2e/offline_inference/stage_configs/hunyuan_image3_dit_only_ci.yaml → tests/e2e/offline_inference/deploy/hunyuan_image3_dit_only_ci.yaml. Updated the test reference.
T2I path design note: kept AR→DiT as hunyuan_image3_t2i (matches the bot_task="text" flow). Users wanting Tencent's bot_task="image" (DIT-only) can pass --pipeline hunyuan_image3_dit_only. Documented in the updated PR description.

xiaohajiayou

Switched examples/offline_inference/hunyuan_image3/end2end.py to Omni.from_cli_args(args, parser=parser, **overrides). The override-precedence design is being revisited in [RFC] Sentinel-default precedence for stage engine args #3035 post-0.20.0.

lishunyang12 · 2026-04-22T15:59:30Z

@TaffyOfficial (re: i2t/t2t topology) — Done in df65e00. Added HUNYUAN_IMAGE3_I2T_PIPELINE and HUNYUAN_IMAGE3_T2T_PIPELINE to pipeline.py and registered both in _OMNI_PIPELINES. BYO deploy yaml works as you described — no registry lookup failure.

Signed-off-by: lishunyang <lishunyang12@163.com>

…i + it2i Image generation is the headline modality. AR-only (i2t/t2t) and DiT-only runs are niche; users can pass --pipeline hunyuan_image3_<variant> with a custom deploy yaml. FP8 toggles via --quantization fp8 (DiT-only path verified; IT2I AR + image FP8 hits an upstream vLLM kernel limitation — see vllm-project#2976). dit_only.yaml moved to tests/e2e/.../stage_configs/ as a CI-only fixture; the dit_only pipeline registration is kept so users can BYO deploy. Signed-off-by: lishunyang <lishunyang12@163.com>

lishunyang12 · 2026-04-22T16:04:41Z

@TaffyOfficial (re: stage_configs rename) — Done. Renamed to tests/e2e/offline_inference/deploy/hunyuan_image3_dit_only_ci.yaml and updated test_hunyuanimage3_text2img.py:21 to match.

lishunyang12 · 2026-04-22T16:04:52Z

@TaffyOfficial (re: T2I path) — Good point, hadn't documented this. Updated the PR description with a "T2I path choice" subsection: hunyuan_image3_t2i stays AR→DiT (matches bot_task="text"), and users wanting Tencent's bot_task="image" semantics use --pipeline hunyuan_image3_dit_only with their own deploy yaml. Thanks for flagging.

TaffyOfficial · 2026-04-23T07:23:47Z

1. The platforms.npu section in hunyuan_image3_t2i.yaml looks potentially inconsistent with the top-level pipeline choice.
The deploy still targets pipeline: hunyuan_image3_t2i (AR → DiT), but the NPU comment says “DiT only — single-stage NPU deployment” and only overrides stage_id: 0.
Given the current platform override behavior is stage-wise patching rather than replacing the full stage list, this seems likely to keep the other base stage(s) unless explicitly overridden.
Could you clarify whether the comment is stale, or whether the NPU path is intended to be truly DiT-only? If it is meant to be DiT-only, it may be better to point it at hunyuan_image3_dit_only or make the intent explicit in the config layout.

2. One coverage concern: the e2e text2img test now points to hunyuan_image3_dit_only_ci.yaml, so it no longer exercises the shipped default hunyuan_image3_t2i.yaml nor the AR→DiT KV-transfer path that this PR makes the default T2I route.
Since the refactor’s main semantic choice is exactly that default path, it would be good to keep at least one test covering the shipped hunyuan_image3_t2i deploy (even if the CI-only DiT-only fixture stays for cost/runtime reasons).

lishunyang12 · 2026-04-24T12:04:08Z

@TaffyOfficial — both addressed in 8d0142e.

NPU section on t2i.yaml was semantically wrong (you were right — platform overlays are stage-wise patches, not full replacements, so an NPU DiT-only override landing on the AR→DiT base was broken). Split fix: dropped the platforms.npu block from hunyuan_image3_t2i.yaml with a pointer comment, and shipped vllm_omni/deploy/hunyuan_image3_dit_only.yaml as the proper home for the DiT-only NPU deployment (CUDA 4x H20 default + NPU 8x A3-64G section). Registry comment updated; hunyuan_image3_dit_only now has a default deploy.
Coverage gap on shipped AR→DiT t2i yaml — added TestHunyuanImage3ShippedDeploys in tests/test_config_factory.py: parametrized parse + pipeline-registry resolution + stage-topology validation across all three shipped deploys (t2i / it2i / dit_only), plus a targeted check that the t2i deploy wires stage 1 to consume stage 0 (KV-transfer path) and pins the 4 AR + 4 DiT placement. Catches schema regressions on the AR→DiT default without burning 8 GPUs. A full e2e on the AR→DiT path needs new golden CLIP embeddings — tracked as a follow-up.
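A structural check of this kind might look like the sketch below. It is not the actual `TestHunyuanImage3ShippedDeploys` code: the `check_t2i_topology` helper and the merged-stage dict shape (in particular `input_sources` as a list of stage ids and `devices` as a comma-separated string) are assumptions for illustration.

```python
def check_t2i_topology(stages):
    """Validate an AR -> DiT layout: stage 1 reads stage 0, 4 + 4 disjoint devices."""
    by_id = {s["stage_id"]: s for s in stages}
    assert set(by_id) == {0, 1}, "expected exactly stages 0 and 1"
    assert by_id[1]["input_sources"] == [0], "stage 1 must consume stage 0"
    ar_devices = by_id[0]["devices"].split(",")
    dit_devices = by_id[1]["devices"].split(",")
    assert len(ar_devices) == 4 and len(dit_devices) == 4, "expected 4 AR + 4 DiT"
    assert not set(ar_devices) & set(dit_devices), "AR and DiT must not share devices"
    return True


# Illustrative merged stage list for the t2i deploy.
merged_t2i = [
    {"stage_id": 0, "input_sources": [], "devices": "0,1,2,3"},
    {"stage_id": 1, "input_sources": [0], "devices": "4,5,6,7"},
]
ok = check_t2i_topology(merged_t2i)
```

A check like this runs on parsed config only, so it needs no GPUs, which is the cost trade-off described above.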

TaffyOfficial · 2026-04-25T11:45:36Z

Thanks, this addresses the two main concerns. The NPU split looks semantically correct now, and the structural coverage for the shipped AR→DiT path is a reasonable CI-cost compromise.

Two small follow-ups:

The PR body / pipeline.py docstring still seem to say dit_only has no shipped default deploy, which is now stale after adding vllm_omni/deploy/hunyuan_image3_dit_only.yaml.
Since these are shipped deploys, the new tests probably should assert the files exist rather than pytest.skip if missing; otherwise deleting a shipped yaml would silently skip the coverage.

lishunyang12 · 2026-04-27T08:37:31Z

@TaffyOfficial — both addressed in e6d526b.

Updated the pipeline.py module docstring: t2i / it2i / dit_only now ship default deploy yamls (with NPU overlay on dit_only); only i2t / t2t are BYO. PR description already reflects this.
Swapped the two pytest.skip paths in TestHunyuanImage3ShippedDeploys for assert deploy_path.exists() — deleting a shipped yaml now fails CI instead of silently skipping.
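The difference between skipping and hard-failing can be sketched as follows. The `check_shipped_deploy` helper and the temporary files are illustrative only and are not the test code from the PR; the yaml content is a one-line placeholder.

```python
import tempfile
from pathlib import Path


def check_shipped_deploy(deploy_path: Path) -> str:
    """Hard-fail when a shipped deploy yaml is missing instead of skipping."""
    assert deploy_path.exists(), f"shipped deploy yaml missing: {deploy_path}"
    return deploy_path.read_text()


with tempfile.TemporaryDirectory() as tmp:
    # A file that exists is read normally.
    present = Path(tmp) / "hunyuan_image3_t2i.yaml"
    present.write_text("pipeline: hunyuan_image3_t2i\n")
    content = check_shipped_deploy(present)

    # A file that was deleted triggers an AssertionError rather than a skip.
    missing = Path(tmp) / "hunyuan_image3_it2i.yaml"
    try:
        check_shipped_deploy(missing)
        missing_detected = False
    except AssertionError:
        missing_detected = True
```

With a skip, the second case would be reported as skipped and CI would stay green; with the assertion it surfaces as a failure.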

lishunyang12 · 2026-04-27T08:44:27Z

@TaffyOfficial Can you help test on the NPU side? I will run the e2e test again today to push towards a ready state.

TaffyOfficial · 2026-04-27T08:52:40Z

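@TaffyOfficial Could you help test the CPU side? I will run the end-to-end test again today to reach a ready state as soon as possible.

Probably not today; tomorrow should work.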

TaffyOfficial · 2026-04-27T09:33:07Z

@lishunyang12

Test results summary

CPU-side testing of PR #2989 is complete:

✅ Passing tests

tests/config/test_pipeline_registry.py: 9/9 passed (10.77s)
- All pipeline registry tests pass (lazy loading, dynamic registration, central registry)

❌ Failing tests

tests/test_config_factory.py: 85 passed, 5 failed (7.44s)

The 5 failures:

1. TestSentinelDefaultPrecedence::test_none_value_skipped_yaml_wins
2. TestHunyuanImage3ShippedDeploys::test_shipped_deploys_parse_and_resolve[hunyuan_image3_t2i.yaml-...]
3. TestHunyuanImage3ShippedDeploys::test_shipped_deploys_parse_and_resolve[hunyuan_image3_it2i.yaml-...]
4. TestHunyuanImage3ShippedDeploys::test_shipped_deploys_parse_and_resolve[hunyuan_image3_dit_only.yaml-...]
5. TestHunyuanImage3ShippedDeploys::test_t2i_ar_dit_topology

Key error:
ValueError: Pipeline 'hunyuan_image3_t2i' has async_chunk=True in deploy but no stage
declares a next-stage input processor (async_chunk_process_next_stage_input_func or
custom_process_next_stage_input_func). Either set async_chunk=False or implement an
async-chunk processor on the pipeline.

Location: vllm_omni/config/stage_config.py:788

This is a real bug in the PR: the HunyuanImage3 deploy config enables async_chunk=True, but pipeline.py does not declare a matching async-chunk processor.

tests/config/test_pipeline_registry.py │ 9/9 passed ✅ │ 10.77s

tests/test_config_factory.py │ 85 passed, 5 failed ❌ │ 7.44s

All 5 failures share the same root cause: merge_pipeline_deploy() raises during validation at stage_config.py:788 because the HunyuanImage3 deploy config has async_chunk=True, while pipeline.py declares neither async_chunk_process_next_stage_input_func nor custom_process_next_stage_input_func.
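The validation that produces this error can be approximated with the sketch below. The two field names are taken from the error message; the `validate_async_chunk` function, its signature, and the stage dict layout are assumptions and do not reproduce the actual `merge_pipeline_deploy()` code.

```python
# Field names quoted from the error message above.
PROCESSOR_FIELDS = (
    "async_chunk_process_next_stage_input_func",
    "custom_process_next_stage_input_func",
)


def validate_async_chunk(pipeline_name, stages, async_chunk):
    """Reject async_chunk=True unless some stage declares a next-stage processor."""
    if not async_chunk:
        return
    has_processor = any(stage.get(f) for stage in stages for f in PROCESSOR_FIELDS)
    if not has_processor:
        raise ValueError(
            f"Pipeline {pipeline_name!r} has async_chunk=True in deploy but no "
            "stage declares a next-stage input processor."
        )


# Illustrative stages: custom_process_input_func is a different field and, in
# this sketch, does not satisfy the check.
stages = [
    {"stage_id": 0},
    {"stage_id": 1, "custom_process_input_func": "hunyuan_image3.ar2diffusion"},
]

# With async_chunk disabled the same stages validate cleanly.
validate_async_chunk("hunyuan_image3_t2i", stages, async_chunk=False)
```

In this model, either declaring a processor on some stage or setting async_chunk to false in the deploy yaml resolves the error, matching the two options named in the message.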

lishunyang12 · 2026-04-28T19:32:58Z

Closed; superseded by #3172.

lishunyang12 changed the title ~~[Config Refactor 4/N] HunyuanImage3 pipeline configs~~ [Config Refactor] HunyuanImage3 pipeline configs Apr 21, 2026

lishunyang12 marked this pull request as ready for review April 21, 2026 13:53

lishunyang12 requested a review from hsliuustc0106 as a code owner April 21, 2026 13:53

hsliuustc0106 reviewed Apr 21, 2026

alex-jw-brooks reviewed Apr 21, 2026

lishunyang12 added this to the v0.20.0 milestone Apr 22, 2026

xiaohajiayou reviewed Apr 22, 2026

lishunyang12 added 3 commits April 23, 2026 00:03

[Config Refactor 4/N] HunyuanImage3 pipeline configs

dd0587a

Signed-off-by: lishunyang <lishunyang12@163.com>

Address review: add i2t/t2t pipelines, FP8 yaml, NPU section, default…

dbadc5c

…s cleanup Signed-off-by: lishunyang <lishunyang12@163.com>

lishunyang12 force-pushed the config-refactor-4-hunyuan-image3 branch from df65e00 to dbadc5c Compare April 22, 2026 16:04

Merge branch 'main' into config-refactor-4-hunyuan-image3

ef96965

lishunyang12 mentioned this pull request Apr 22, 2026

[Config Refactor]: Remove bagel yaml #2936

Merged

11 tasks

Address review: ship dit_only deploy yaml, drop stale NPU on t2i

8d0142e

Signed-off-by: lishunyang <lishunyang12@163.com>

This was referenced Apr 25, 2026

[Config Refactor] Remove legacy Omni CLI arg helper and align tests with nullified parser defaults #3144

Merged

[Follow-up] Deploy/pipeline config follow-ups from #2383 #2887

Open

Address review: dit_only deploy docstring + hard-fail missing shipped…

e6d526b

… yaml Signed-off-by: lishunyang <lishunyang12@163.com>

Merge branch 'main' into config-refactor-4-hunyuan-image3

db8a101

Signed-off-by: lishunyang <lishunyang12@163.com>

lishunyang12 added 14 commits April 27, 2026 17:47

Drop t2i_fp8 yaml; document FP8 recipe via --stage-overrides

bfcd436

Signed-off-by: lishunyang <lishunyang12@163.com>

Add --quantization and --stage-overrides flags to hunyuan_image3 end2end

bc98230

Signed-off-by: lishunyang <lishunyang12@163.com>

Set async_chunk: false on hunyuan_image3 deploy yamls

62c6d5b

Signed-off-by: lishunyang <lishunyang12@163.com>

Restore t2i_fp8 yaml: --quantization is stripped by strip_parent_engi…

7a55adc

…ne_args Signed-off-by: lishunyang <lishunyang12@163.com>

Allow --quantization through CLI; drop t2i_fp8 yaml in favor of overr…

5f6c7ce

…ide recipe Signed-off-by: lishunyang <lishunyang12@163.com>

Fix FP8 recipe in t2i.yaml header: DiT TP lives in parallel_config

be9e145

Signed-off-by: lishunyang <lishunyang12@163.com>

Revert "Fix FP8 recipe in t2i.yaml header: DiT TP lives in parallel_c…

2e87df8

…onfig" This reverts commit be9e145.

Revert "Allow --quantization through CLI; drop t2i_fp8 yaml in favor …

85bee7a

…of override recipe" This reverts commit 5f6c7ce.

Revert "Add --quantization and --stage-overrides flags to hunyuan_ima…

57a67c1

…ge3 end2end" This reverts commit bc98230.

Lower AR gpu_memory_utilization on t2i_fp8 to fit alongside DiT on 2x…

faeb67f

… H200 Signed-off-by: lishunyang <lishunyang12@163.com>

Revert "Lower AR gpu_memory_utilization on t2i_fp8 to fit alongside D…

5b6a2f9

…iT on 2x H200" This reverts commit faeb67f.

Flatten parallel_config schema in hunyuan_image3 deploy yamls

0cf1f5a

Signed-off-by: lishunyang <lishunyang12@163.com>

Add hunyuan_image3_dit_only_2gpu deploy yaml

08ae286

Signed-off-by: lishunyang <lishunyang12@163.com>

Set explicit kv_cache_memory_bytes on AR stage of t2i_fp8 to bypass c…

eaf2a95

…olocation profiling race Signed-off-by: lishunyang <lishunyang12@163.com>

lishunyang12 mentioned this pull request Apr 28, 2026

[Config] Add HunyuanImage3 deploy configs #3172

Merged

lishunyang12 closed this Apr 29, 2026
